terminal cost
High-dimensional Mean-Field Games by Particle-based Flow Matching
Yu, Jiajia, Lee, Junghwan, Xie, Yao, Cheng, Xiuyuan
Mean-field games (MFGs) study the Nash equilibrium of systems with a continuum of interacting agents, which can be formulated as the fixed-point of optimal control problems. They provide a unified framework for a variety of applications, including optimal transport (OT) and generative models. Despite their broad applicability, solving high-dimensional MFGs remains a significant challenge due to fundamental computational and analytical obstacles. In this work, we propose a particle-based deep Flow Matching (FM) method to tackle high-dimensional MFG computation. In each iteration of our proximal fixed-point scheme, particles are updated using first-order information, and a flow neural network is trained to match the velocity of the sample trajectories in a simulation-free manner. Theoretically, in the optimal control setting, we prove that our scheme converges to a stationary point sublinearly, and upgrade to linear (exponential) convergence under additional convexity assumptions. Our proof uses FM to induce an Eulerian coordinate (density-based) from a Lagrangian one (particle-based), and this also leads to certain equivalence results between the two formulations for MFGs when the Eulerian solution is sufficiently regular. Our method demonstrates promising performance on non-potential MFGs and high-dimensional OT problems cast as MFGs through a relaxed terminal-cost formulation.
Fixed Horizon Linear Quadratic Covariance Steering in Continuous Time with Hilbert-Schmidt Terminal Cost
Sial, Tushar, Halder, Abhishek
We formulate and solve the fixed horizon linear quadratic covariance steering problem in continuous time with a terminal cost measured in Hilbert-Schmidt (i.e., Frobenius) norm error between the desired and the controlled terminal covariances. For this problem, the necessary conditions of optimality become a coupled matrix ODE two-point boundary value problem. To solve this system of equations, we design a matricial recursive algorithm and prove its convergence. The proposed algorithm and its analysis make use of the linear fractional transforms parameterized by the state transition matrix of the associated Hamiltonian matrix. To illustrate the results, we provide two numerical examples: one with a two dimensional and another with a six dimensional state space.
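As a minimal sketch of the terminal cost described in this abstract, the snippet below evaluates the squared Frobenius (Hilbert-Schmidt) norm of the error between a controlled and a desired terminal covariance. The function name, the example matrices, and the choice to square the norm without any scaling factor are assumptions made for illustration; the paper may define the cost differently.

```python
def hilbert_schmidt_terminal_cost(sigma_T, sigma_desired):
    """Squared Frobenius norm of the terminal covariance error.

    Both arguments are square matrices given as nested lists.
    Squaring the norm (and using no weight) is an assumption of this sketch.
    """
    total = 0.0
    for row_T, row_d in zip(sigma_T, sigma_desired):
        for entry_T, entry_d in zip(row_T, row_d):
            total += (entry_T - entry_d) ** 2
    return total


# Hypothetical 2x2 example: controlled terminal covariance vs. identity target.
sigma_T = [[2.0, 0.5], [0.5, 1.0]]
sigma_d = [[1.0, 0.0], [0.0, 1.0]]
cost = hilbert_schmidt_terminal_cost(sigma_T, sigma_d)
```

The cost vanishes exactly when the two covariances coincide, which is what makes it a soft replacement for a hard terminal covariance constraint.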
Bridging Finite and Infinite-Horizon Nash Equilibria in Linear Quadratic Games
Salizzoni, Giulio, Hall, Sophie, Kamgarpour, Maryam
Finite-horizon linear quadratic (LQ) games admit a unique Nash equilibrium, while infinite-horizon settings may have multiple. We clarify the relationship between these two cases by interpreting the finite-horizon equilibrium as a nonlinear dynamical system. Within this framework, we prove that its fixed points are exactly the infinite-horizon equilibria and that any such equilibrium can be recovered by an appropriate choice of terminal costs. We further show that periodic orbits of the dynamical system, when they arise, correspond to periodic Nash equilibria, and we provide numerical evidence of convergence to such cycles. Finally, simulations reveal three asymptotic regimes: convergence to stationary equilibria, convergence to periodic equilibria, and bounded non-convergent trajectories. These findings offer new insights and tools for tuning finite-horizon LQ games using infinite-horizon equilibria.
Data-Driven Density Steering via the Gromov-Wasserstein Optimal Transport Distance
Nakashima, Haruto, Ganguly, Siddhartha, Kashima, Kenji
We tackle the data-driven chance-constrained density steering problem using the Gromov-Wasserstein metric. The underlying dynamical system is an unknown linear controlled recursion, with the assumption that sufficiently rich input-output data from pre-operational experiments are available. The initial state is modeled as a Gaussian mixture, while the terminal state is required to match a specified Gaussian distribution. We reformulate the resulting optimal control problem as a difference-of-convex program and show that it can be efficiently and tractably solved using the DC algorithm.
Infinite-Horizon Value Function Approximation for Model Predictive Control
Jordana, Armand, Kleff, Sébastien, Haffemayer, Arthur, Ortiz-Haro, Joaquim, Carpentier, Justin, Mansard, Nicolas, Righetti, Ludovic
Model Predictive Control has emerged as a popular tool for robots to generate complex motions. However, the real-time requirement has limited the use of hard constraints and large preview horizons, which are necessary to ensure safety and stability. In practice, practitioners have to carefully design cost functions that can imitate an infinite horizon formulation, which is tedious and often results in local minima. In this work, we study how to approximate the infinite horizon value function of constrained optimal control problems with neural networks using value iteration and trajectory optimization. Furthermore, we demonstrate how using this value function approximation as a terminal cost provides global stability to the model predictive controller. The approach is validated on two toy problems and a real-world scenario with online obstacle avoidance on an industrial manipulator where the value function is conditioned to the goal and obstacle.
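The role of an infinite-horizon value function as a terminal cost can be illustrated on a scalar linear-quadratic problem, where the value function is known in closed form. This is only a hedged toy sketch and not the neural network method of the paper: the dynamics x_next = a*x + b*u, the stage cost q*x^2 + r*u^2, the parameter values, and the function names are all assumptions chosen for illustration.

```python
import math


def riccati_step(p_next, a=1.0, b=1.0, q=1.0, r=1.0):
    """One backward step of the scalar discrete-time Riccati recursion."""
    return q + a * a * p_next - (a * b * p_next) ** 2 / (r + b * b * p_next)


def first_gain(terminal_weight, horizon, a=1.0, b=1.0, q=1.0, r=1.0):
    """Gain k of the first control u = -k*x in a horizon-N problem
    with terminal cost terminal_weight * x_N^2."""
    p = terminal_weight
    # Propagate the cost-to-go weight from stage N back to stage 1.
    for _ in range(horizon - 1):
        p = riccati_step(p, a, b, q, r)
    return a * b * p / (r + b * b * p)


# For a = b = q = r = 1 the infinite-horizon weight solves p^2 = p + 1.
p_inf = (1.0 + math.sqrt(5.0)) / 2.0

gain_short_with_value = first_gain(p_inf, horizon=1)  # terminal cost = value function
gain_short_without = first_gain(0.0, horizon=1)       # no terminal cost
gain_long_without = first_gain(0.0, horizon=30)       # long horizon, no terminal cost
```

In this toy setting, a one-step horizon with the infinite-horizon value function as terminal cost reproduces the gain that the problem without terminal cost only approaches as the horizon grows, while the one-step problem without terminal cost applies no control at all.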
RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control
Rout, Litu, Chen, Yujia, Ruiz, Nataniel, Kumar, Abhishek, Caramanis, Constantine, Shakkottai, Sanjay, Chu, Wen-Sheng
We propose Reference-Based Modulation (RB-Modulation), a new plug-and-play solution for training-free personalization of diffusion models. Existing training-free approaches exhibit difficulties in (a) style extraction from reference images in the absence of additional style or content text descriptions, (b) unwanted content leakage from reference style images, and (c) effective composition of style and content. RB-Modulation is built on a novel stochastic optimal controller where a style descriptor encodes the desired attributes through a terminal cost. The resulting drift not only overcomes the difficulties above, but also ensures high fidelity to the reference style and adheres to the given text prompt. We also introduce a cross-attention-based feature aggregation scheme that allows RB-Modulation to decouple content and style from the reference image. With theoretical justification and empirical evidence, our framework demonstrates precise extraction and control of content and style in a training-free manner. Further, our method allows a seamless composition of content and style, which marks a departure from the dependency on external adapters or ControlNets.
Koopman Data-Driven Predictive Control with Robust Stability and Recursive Feasibility Guarantees
de Jong, Thomas, Breschi, Valentina, Schoukens, Maarten, Lazar, Mircea
In this paper, we consider the design of data-driven predictive controllers for nonlinear systems from input-output data via linear-in-control input Koopman lifted models. Instead of identifying and simulating a Koopman model to predict future outputs, we design a subspace predictive controller in the Koopman space. This allows us to learn the observables minimizing the multi-step output prediction error of the Koopman subspace predictor, preventing the propagation of prediction errors. To avoid losing feasibility of our predictive control scheme due to prediction errors, we compute a terminal cost and terminal set in the Koopman space and we obtain recursive feasibility guarantees through an interpolated initial state. As a third contribution, we introduce a novel regularization cost yielding input-to-state stability guarantees with respect to the prediction error for the resulting closed-loop system. The performance of the developed Koopman data-driven predictive control methodology is illustrated on a nonlinear benchmark example from the literature.
Risk-Aware Non-Myopic Motion Planner for Large-Scale Robotic Swarm Using CVaR Constraints
Yang, Xuru, Hu, Yunze, Gao, Han, Ding, Kang, Li, Zhaoyang, Zhu, Pingping, Sun, Ying, Liu, Chang
Swarm robotics has garnered significant attention due to its ability to accomplish elaborate and synchronized tasks. Existing methodologies for motion planning of swarm robotic systems mainly encounter difficulties in scalability and safety guarantees. To address these limitations, we propose a Risk-aware swarm mOtion planner using conditional ValuE at Risk (ROVER) that systematically navigates large-scale swarms through cluttered environments while ensuring safety. ROVER formulates a finite-time model predictive control (FTMPC) problem predicated upon the macroscopic state of the robot swarm represented by a Gaussian Mixture Model (GMM) and integrates conditional value-at-risk (CVaR) to ensure collision avoidance. The key component of ROVER is imposing a CVaR constraint on the distribution of the Signed Distance Function between the swarm GMM and obstacles in the FTMPC to enforce collision avoidance. Utilizing the analytical expression of CVaR of a GMM derived in this work, we develop a computationally efficient method to solve the non-linear constrained FTMPC through sequential linear programming. Simulations and comparisons with representative benchmark approaches demonstrate the effectiveness of ROVER in flexibility, scalability, and risk mitigation.
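To make the CVaR notion concrete, the sketch below evaluates the standard closed-form CVaR of a single Gaussian random variable. It is not the GMM expression derived in the paper; the function names, the convention that larger values are worse (a loss), and the example level 0.95 are assumptions for illustration.

```python
from statistics import NormalDist


def gaussian_var(mean, std, alpha):
    """Value-at-risk: the alpha-quantile of a Gaussian loss."""
    return mean + std * NormalDist().inv_cdf(alpha)


def gaussian_cvar(mean, std, alpha):
    """CVaR at level alpha: expected loss within the worst (1 - alpha) tail.

    Uses the closed form mean + std * pdf(z) / (1 - alpha),
    where z is the standard normal alpha-quantile.
    """
    standard = NormalDist()
    z = standard.inv_cdf(alpha)
    return mean + std * standard.pdf(z) / (1.0 - alpha)


# Hypothetical example: standard normal loss at the 95% level.
var_95 = gaussian_var(0.0, 1.0, 0.95)
cvar_95 = gaussian_cvar(0.0, 1.0, 0.95)
```

Because CVaR averages over the tail beyond the quantile, it is never smaller than the corresponding value-at-risk, which is why a CVaR constraint is the more conservative of the two.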
Bridging the Gaps: Learning Verifiable Model-Free Quadratic Programming Controllers Inspired by Model Predictive Control
Lu, Yiwen, Li, Zishuo, Zhou, Yihan, Li, Na, Mo, Yilin
In this paper, we introduce a new class of parameterized controllers, drawing inspiration from Model Predictive Control (MPC). The controller resembles a Quadratic Programming (QP) solver of a linear MPC problem, with the parameters of the controller being trained via Deep Reinforcement Learning (DRL) rather than derived from system models. This approach addresses the limitations of common controllers with Multi-Layer Perceptron (MLP) or other general neural network architectures used in DRL, in terms of verifiability and performance guarantees, and the learned controllers possess verifiable properties like persistent feasibility and asymptotic stability akin to MPC. On the other hand, numerical examples illustrate that the proposed controller empirically matches MPC and MLP controllers in terms of control performance and has superior robustness against modeling uncertainty and noise. Furthermore, the proposed controller is significantly more computationally efficient compared to MPC and requires fewer parameters to learn than MLP controllers. Real-world experiments on a vehicle drift maneuvering task demonstrate the potential of these controllers for robotics and other demanding control tasks.
Eco-Driving Control of Connected and Automated Vehicles using Neural Network based Rollout
Paugh, Jacob, Zhu, Zhaoxuan, Gupta, Shobhit, Canova, Marcello, Stockar, Stephanie
Connected and autonomous vehicles have the potential to minimize energy consumption by optimizing the vehicle velocity and powertrain dynamics with Vehicle-to-Everything information en route. Existing deterministic and stochastic methods created to solve the eco-driving problem generally suffer from high computational and memory requirements, which makes online implementation challenging. This work proposes a hierarchical multi-horizon optimization framework implemented via a neural network. The neural network learns a full-route value function to account for the variability in route information and is then used to approximate the terminal cost in a receding horizon optimization. Simulations over real-world routes demonstrate that the proposed approach achieves comparable performance to a stochastic optimization solution obtained via reinforcement learning, while requiring no sophisticated training paradigm and negligible on-board memory.